An Automatic Segmentation Algorithm Based on Chinese Phoneme

Authors

  • Qian ZHU
  • Yue-Hong CAI
  • Xian-Yi CHENG
Abstract

The primary task of Chinese language processing is to establish an efficient and accurate segmentation strategy. Exploiting the nature of Chinese as an ideo-phonetic language, the paper proposes an automatic segmentation algorithm based on Chinese phonemes to resolve ambiguity. First, a candidate tag set is built from the ambiguous phrases that arise from Chinese polyphones, and every possible segmentation of each phrase forms the segmentation tag set. Next, the calculation of the posterior probability is recast as an optimization problem, which is solved with a genetic algorithm. The approach also addresses the sparse data problem in HMMs. Experiments show that the method is a practical and effective way to resolve the ambiguity caused by polyphones.
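The optimization step described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a toy unigram lexicon with made-up log-probabilities in place of the paper's phoneme-based posterior, and the names `LEXICON`, `UNKNOWN_LOGP`, `decode`, `fitness`, and `segment_ga` are all hypothetical. It only shows how a genetic algorithm might search over candidate segmentations, each encoded as a bit vector of word boundaries.

```python
import math
import random

# Hypothetical toy lexicon: word -> log-probability (illustrative values only,
# standing in for the posterior score that the paper derives from phonemes).
LEXICON = {
    "研究": math.log(0.02),
    "研究生": math.log(0.005),
    "生命": math.log(0.01),
    "命": math.log(0.001),
    "起源": math.log(0.008),
}
UNKNOWN_LOGP = math.log(1e-8)  # assumed penalty for pieces not in the lexicon


def decode(sentence, cuts):
    """Turn a boundary bit vector into words.

    cuts[i] == 1 means a word boundary follows sentence[i].
    """
    words, start = [], 0
    for i, bit in enumerate(cuts):
        if bit:
            words.append(sentence[start:i + 1])
            start = i + 1
    words.append(sentence[start:])
    return words


def fitness(sentence, cuts):
    """Score a segmentation as the sum of its words' log-probabilities."""
    return sum(LEXICON.get(w, UNKNOWN_LOGP) for w in decode(sentence, cuts))


def segment_ga(sentence, pop_size=30, generations=40, p_mut=0.1, seed=0):
    """Search for a high-scoring segmentation with a simple genetic algorithm."""
    rng = random.Random(seed)
    n = len(sentence) - 1  # one gene per gap between adjacent characters
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def score(chromosome):
        return fitness(sentence, chromosome)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        next_pop = [pop[0][:]]  # elitism: carry the best individual forward
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            # One-point crossover, then bit-flip mutation.
            point = rng.randint(1, n - 1) if n > 1 else 0
            child = a[:point] + b[point:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop

    best = max(pop, key=score)
    return decode(sentence, best), score(best)
```

Calling `segment_ga("研究生命起源")` returns a list of words together with its score. For a sentence this short the search space has only 32 candidates, so exhaustive enumeration would do; a genetic algorithm becomes relevant only when the number of candidate segmentations is too large to enumerate.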

Similar resources

Automatic Prostate Cancer Segmentation Using Kinetic Analysis in Dynamic Contrast-Enhanced MRI

Background: Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) provides functional information on the microcirculation in tissues by analyzing the enhancement kinetics which can be used as biomarkers for prostate lesions detection and characterization. Objective: The purpose of this study is to investigate spatiotemporal patterns of tumors by extracting semi-quantitative as well as w...

Towards A Phoneme Labeled Mandarin Chinese Speech Corpus

Phoneme level transcription of speech corpora is crucial to fundamental speech research and the increasingly interested detection-based automatic speech recognition. Currently, there is no existing phoneme-labeled Mandarin Chinese speech corpus. This paper presents our recent work towards development of such a corpus. Our goal is to label five hours of speech data selected from a Mandarin Chine...

Improved HMM/SVM methods for automatic phoneme segmentation

This paper presents improved HMM/SVM methods for a two-stage phoneme segmentation framework, which tries to imitate the human phoneme segmentation process. The first stage performs hidden Markov model (HMM) forced alignment according to the minimum boundary error (MBE) criterion. The objective is to align a phoneme sequence of a speech utterance with its acoustic signal counterpart based on MBE-...

Additional use of phoneme duration hypotheses in automatic speech segmentation

In this paper, we describe a new approach for speaker independent automatic phoneme alignment. Typical algorithms for this task use only phoneme-to-frame similarity measures which are somehow maximised or minimised. In addition to such similarity measures, we use phoneme duration hypotheses generated by the speech synthesis system HADIFIX [1]. For algorithms based on dynamic programming, it is ...

A constrained Baum-Welch algorithm for improved phoneme segmentation and efficient training

We describe an extension to the Baum-Welch algorithm for training Hidden Markov Models that uses explicit phoneme segmentation to constrain the forward and backward lattice. The HMMs trained with this algorithm can be shown to improve the accuracy of automatic phoneme segmentation. In addition, this algorithm is significantly more computationally efficient than the full Baum-Welch algorithm, whi...

Publication date: 2009